This writing is a complimentary document to the power session workshop that was taught by Richard Johansen and Mark Chalmers during Data Day 2019 at the University of Cincinnati. It is written with the expectation that the user is familiar with using Rstudio. If you follow along with these instructions, you will be able to create better, interactive maps of social vulnerability at the county level. For a more in depth explanation as to how the data was retrieved, cleaned, and manipulated, please refer to the full R script called Mapping_Social_Vulnerability.R located in the Scripts folder of the GitHub repository. At this point, you might be asking yourself “What GitHub repository?”.
The original script, all of the data, the powerpoint presentation, and supplemental resources used in the workshop can be downloaded from the GitHub repository located at: https://github.com/RAJohansen/DataDay2019.
The first step towards creating these maps yourself is to clone that repository, extract the files, and open the R project file titled DataDay2019. This should open with Rstudio automatically. If it does not know to open with Rstudio, you will have to tell it to use that program.
Using the project file allows you to use the code that is in this script to read in the raw or clean data files without changing the directory structure. The code in this script is what was used in the workshop to create the following map examples.
The next step is to make sure the following packages are downloaded and installed in your R environment. To install these packages, run the following line of code. You might already have some (such as dplyr and tidyverse) and you should modify this to only download the packages you do not already have.
install.packages(c("tigris","tmap","tidyverse","tabulizer","dplyr","sf","leaflet"))
We will not necessarily be using all of these packages in the code included here, but if you want to follow each step in the original script you will need these packages. They are also very useful if you work with geospatial data and want to make better maps. If you only want to produce maps with the data that is provided here, you will need the following packages: dplyr, sf, tmap, and leaflet.
After the packages are installed, the next step is to use the library function to make sure we can use the functions these packages provide us. Run the following lines of code and we are ready to get started!
library(tigris)
library(tmap)
library(tidyverse)
library(tabulizer)
library(dplyr)
library(sf)
library(leaflet)
The University of South Carolina makes their county level social vulnerability data publicly available on their website. http://artsandsciences.sc.edu/geog/hvri/sovi%C2%AE-0
If you want to learn more about the theory of social vulnerability and how it is measured, please refer to the original paper titled “Social Vulnerability to Environmental Hazards” which is included in the resources folder of the GitHub repository.
The data that was used in the workshop and that is being used to make the maps in this document was scraped directly from the PDF on the website located at “http://artsandsciences.sc.edu/geog/hvri/sites/sc.edu.geog.hvri/files/attachments/SoVI_10_14_Website.pdf”.
Multiple students at the workshop found this demonstration of scraping data from a pdf on the web useful. The code written below comes directly from the Mapping_Social_Vulnerability.R script and shows how to use the extract_tables function to scrape the data from the pdf at that website. This is included as an example, but moving forward we will just be reading in the data without going into the retrieval and cleaning process. This is done to simplify the process as much as possible so we can focus our attention on producing the visualizations and learning the syntax of the mapping packages, tmap and leaflet. If you are curious about the data cleaning process, refer to the full Mapping_Social_Vulnerability.R script. You do not need to run the code in the chunk below to move forward with the rest of the process.
Sovi_table <- extract_tables("http://artsandsciences.sc.edu/geog/hvri/sites/sc.edu.geog.hvri/files/attachments/SoVI_10_14_Website.pdf")
# Lets use two more functions to convert the extracted table into a more usable and analysis friendly format
final <- do.call(rbind, Sovi_table[-length(Sovi_table)])
# Reformate table headers by dropping the first row
final <- as.data.frame(final[2:nrow(final), ])
# Lets lable the column names so they can merged with Census data
headers <- c('GEOID', 'State_FIP', 'County_FIP', 'County_Name', 'CNTY_SoVI',
'Percentile')
# Apply our names to the data frame
names(final) <- headers
# **NOTE** GEOID is the ID code for CENSUS data
# This is mandatory for the next section
### Step 4: Save the table as a csv
# This is helpful for reproducibility and eliminating redundancy
write.csv(final, file='Data/SoVI.csv', row.names=FALSE)
The csv output from the last line of code above is the data that is already saved in the repository you downloaded. Instead of repeating the work done above, just run the line of code directly below and you will read the cleaned data file into your environment.
df <- read.csv('Data/SoVI.csv')
The first five rows of the data frame should look like this:
## GEOID State_FIP County_FIP County_Name CNTY_SoVI Percentile
## 1 1005 1 5 Barbour -0.93 34.8
## 2 1027 1 27 Clay 0.15 52.8
## 3 1091 1 91 Marengo 1.40 73.2
## 4 1069 1 69 Houston 0.19 53.4
## 5 1019 1 19 Cherokee -1.46 26.9
## 6 1003 1 3 Baldwin -1.03 33.5
Now that we have the social vulnerability data, lets move forward and make some maps. Instead of looking at County level information for the entire country, we are going to narrow the scope down to one state. Our example state is Florida but the user could look at any state they please by making very slight modifications to the code.
The first line of code below reads in the census data from Florida. The “st_read”" function is from the sf package and is used for reading in geospatial data. The second line takes only the social vulnerability data from our entire data set with State_FIP = 12, which translates to Florida. The next line merges these two data sets together.
Counties_FL <- st_read('Data/Counties_FL.gpkg')
df_FL <- subset(df,State_FIP == "12")
FL_SoVI <- merge(Counties_FL,df_FL, by = "GEOID", all = FALSE)
Now that the spatial data and social vulnerability data is merged together, we can start making maps. This line of code uses base R’s most basic plotting function to produce our first map.
plot(FL_SoVI[1])
This map has some issues, but you should still be able to tell that it is the state of Florida. The color does not appear to be meaningful but we can fix that. This is a good start because it confirms we are working with the proper spatial data.
Now we are going to use the tmap package which uses the same “grammar of graphics” as ggplot, a common plotting package in R. The tmap package is specially designed for visualizing geographic and spatial data.
The first map will be basic and then we will add layers. Look closely at the syntax of the following lines of code to see what changes from example to example.
tm_shape(FL_SoVI) +
tm_fill()
To add county borders, add another argument using “tm_borders”.
tm_shape(FL_SoVI) +
tm_borders() +
tm_fill()
In order to add social vulnerability data, specify the data column in the “tm_fill” function. This will use the social vulnerability values to fill in the counties and let us visually explore trends.
tm_shape(FL_SoVI) +
tm_borders() +
tm_fill(col = "CNTY_SoVI")
By adding in the social vulnerability data, tmap automatically created a scale and break points. This shows anybody looking at the map the color gradation scale. Next we will customize this scale and manually define break points.
breaks = c(-6,-3,0,3,6)
tm_shape(FL_SoVI) +
tm_borders() +
tm_fill(col = "CNTY_SoVI",breaks = breaks)
The color scheme on this map goes against most people’s intuition. In the original data set, positive values correspond to higher vulnerability. Another way to think of this is that negative values correspond to higher resilience, or less vulnerability. We can flip the color scheme to better match our intuition by adding a negative sign to the palette argument. This has the effect of having the red colored counties correspond to the more vulnerable areas and the green colored counties correspond to the more resilient areas.
tm_shape(FL_SoVI) +
tm_borders() +
tm_fill(col = "CNTY_SoVI",breaks = breaks, palette = "-RdYlGn")
Just for fun, we can choose a completely different color palette and manually make the scale bar continuous. This color scheme has the effect of somewhat deemphasing the more relisient zones by having a dark color gradient occupy the entire negative value regime. It also strongly emphasizes the vulnerable zones by having a rapidly varying light color gradient in the positive regime.
tm_shape(FL_SoVI) +
tm_borders() +
tm_fill(col = "CNTY_SoVI", style = "cont", palette = "viridis")
Now it is time for some finishing touches. The code below adds a scale bar, a title, and a directional arrow showing North. You should tinker with some of the various values in the code below, such as legend.title.size or inner.margins, to see the effect it has on the resulting map.
tm_shape(FL_SoVI) +
tm_borders() +
tm_fill(col = "CNTY_SoVI", style = "cont", palette = "viridis") +
tm_layout(title = "Florida SoVI Vulnerability Index by County",
legend.outside = FALSE,
frame = TRUE,
inner.margins = 0.1,
legend.title.size = 1.5,
legend.text.size = 1.1) +
tm_compass(type = "arrow", position = c("right", "top"), size = 2) +
tm_scale_bar(breaks = c(0, 100, 200),size = 0.8)
The tmap package allows users to make much more sophisticated visualizations than base plot. Now that we know how to create these beautiful maps, the next step is to add the extra layer of interactivity so that users can better engage with the visualizations.
First, convert the static tmap to an interactive map using the leaflet package. The popup.vars argument in the tm_fill function allows us to define what information pops up when the user clicks on a given county. Go ahead and click on a county in the map below, it should give you the name and the specific social vulnerability value for that county. You should also see the name of any given county just by mousing over it.
map <- tm_shape(FL_SoVI) +
tm_borders() +
tm_fill(col = "CNTY_SoVI",
palette = "-RdYlGn",
id = "NAME",
popup.vars = c("NAME","CNTY_SoVI"))
tmap_leaflet(map)
Next we are going to create a more complex map with leaflet. We are also going to add population data and very approximate sea level rise predictions. These were made intentionally jagged to greatly reduce the file size.
The code below reads in the Florida population data that is included in the repository and merges it with the spatial data.
FL_pop <- read.csv("Data/FL_Population.csv")
FL_pop$NAMELSAD <- FL_pop$County
FL_SoVI <- merge(FL_SoVI,FL_pop, by = "NAMELSAD", all = FALSE)
Read the sea level data into the working environment as well so that it can be added to the visualizations.
FL_slr_10ft <- st_read("Data/FL_slr_10ft.gpkg")
The following code creates duplicate maps so we can do a side by side comparison of social vulnerability and population with the projected sea level rise overlayed. Try and look closely at the syntax to get an idea of what is going on.
facets <- c("CNTY_SoVI","Population")
map_facets <- tm_shape(FL_SoVI) +
tm_borders() +
tm_fill(col = facets,
palette = "-RdYlGn",
id = "NAME",
popup.vars = c("NAME","CNTY_SoVI", "Population")) +
tm_shape(FL_slr_10ft) +
tm_polygons(col = "blue", alpha = 0.75) +
tm_facets(nrow = 1, sync = TRUE, free.scales.fill =TRUE)
tmap_leaflet(map_facets)
Remember you can always check the documentation if you do not know what a specific function is doing. For example, to get more clarification on what tm_facets is doing, you can run the following line of code and it will open up the documentation.
??tm_facets
There is a lot to explore in the map above. For example, you can use the scroll wheel on your mouse to zoom in and out. You can also click on a county to get the vulnerability measure and we added the population to this pop out box as an additional example. On the left hand side of either of the maps is a box with 3 squares layered on top of each other. By clicking on that, you can toggle on or off the sea level projections, the county level vulnerability measurements, and change the background map.
For the last map, this code shows you how to change the basemap that is underneath the interactive map.
map_facets_base <- tm_basemap(leaflet::providers$Esri.WorldImagery) +
tm_shape(FL_SoVI) +
tm_polygons(facets) +
tm_borders() +
tm_fill(col = facets,
id = "NAME",
palette = "-RdYlGn",
popup.vars = c("NAME","CNTY_SoVI", "Population")) +
tm_shape(FL_slr_10ft) +
tm_polygons(col = "blue", alpha = 0.75) +
tm_facets(nrow = 1, sync = TRUE)
tmap_leaflet(map_facets_base)